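Stochastic filters for online state estimation are a core technology for autonomous systems. The performance of such filters is one of the key limiting factors on a system's capability. Both the asymptotic behaviour (e.g., for regular operation) and the transient response (e.g., for fast initialization and reset) of such filters are crucial to guaranteeing robust operation of autonomous systems. This paper introduces a new generic formulation of a gyroscope-aided attitude estimator using N direction measurements, including both body-frame and reference-frame direction measurements. The approach is based on an integrated state formulation that incorporates navigation, extrinsic calibration of all direction sensors, and gyroscope bias states in a single equivariant geometric structure. The newly proposed symmetry allows modular addition of different direction measurements and their extrinsic calibration while retaining the ability to include bias states in the same symmetry. The filter-based estimator subsequently built on this symmetry noticeably improves the transient response, as well as the asymptotic bias and extrinsic calibration estimates, compared with state-of-the-art approaches. The estimator is verified in statistically representative simulations and tested in real-world experiments.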
Translated by Google Translate
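Event cameras are bio-inspired dynamic vision sensors that respond to changes in image intensity with high temporal resolution, high dynamic range, and low latency. These sensor characteristics are ideally suited to visual target tracking combined with a broadcast visual communication channel for smart visual beacons, with applications in distributed robotics. Visual beacons can be constructed by high-frequency modulation of light-emitting diodes (LEDs), such as vehicle headlights, Internet of Things (IoT) LEDs, and smart building lights, that are already present in many real-world scenarios. The high temporal resolution of event cameras allows them to capture visual signals at far higher data rates than classical frame-based cameras. In this paper, we propose a novel smart visual beacon architecture with both LED modulation and event camera demodulation algorithms. We quantitatively evaluate the relationship between LED transmission rate, communication distance, and message transmission accuracy for our prototype smart visual beacon communication system. The proposed method achieves up to 4 kbps in an indoor environment and lossless transmission over a distance of 100 meters at a transmission rate of 500 bps in full sunlight, demonstrating the potential of the technology in outdoor environments.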
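To illustrate how an LED beacon and an event camera could exchange bits, the following is a minimal sketch that assumes plain on-off keying and an idealized, noise-free sensor. It is not the modulation or demodulation scheme of the paper; the function names (`led_events`, `decode_events`) and the event format `(timestamp, polarity)` are hypothetical.

```python
def bytes_to_bits(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits):
    """Pack bits (most significant first) back into bytes, dropping any remainder."""
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def led_events(bits, bit_rate_bps, initial_state=0):
    """Assumed ideal sensor: one event per LED transition (+1 = on, -1 = off)."""
    period = 1.0 / bit_rate_bps
    events = []
    state = initial_state
    for k, bit in enumerate(bits):
        if bit != state:
            events.append((k * period, 1 if bit else -1))
            state = bit
    return events


def decode_events(events, n_bits, bit_rate_bps, initial_state=0):
    """Recover bits by sampling the reconstructed LED state mid-way through each slot."""
    period = 1.0 / bit_rate_bps
    bits = []
    state = initial_state
    index = 0
    for k in range(n_bits):
        sample_time = (k + 0.5) * period
        while index < len(events) and events[index][0] <= sample_time:
            state = 1 if events[index][1] > 0 else 0
            index += 1
        bits.append(state)
    return bits


message = b"hi"
sent_bits = bytes_to_bits(message)
events = led_events(sent_bits, bit_rate_bps=500)
recovered = bits_to_bytes(decode_events(events, len(sent_bits), bit_rate_bps=500))
```

Because an event camera reports changes rather than absolute intensity, the decoder in this sketch has to integrate transitions to recover the LED state; a real system would also need synchronization and noise handling, which are omitted here.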
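High-performance trajectory tracking control of quadrotor vehicles is an important challenge in aerial robotics. Symmetry is a fundamental property of physical systems and offers the potential to provide a tool for designing high-performance control algorithms. We propose a design methodology that takes any given symmetry, linearises the associated error in a single set of coordinates, and uses LQR design to obtain a high-performance controller; an approach we term Equivariant Regulator design. We show that quadrotor vehicles admit several different symmetries: the direct product symmetry, the extended pose symmetry, and the pose and velocity symmetry, and show that each symmetry can be used to define a global error. We compare the linearised systems in simulation and find that the extended pose symmetry and the pose and velocity symmetry outperform the direct product symmetry in the presence of large disturbances. This suggests that choosing equivariant and group affine symmetries reduces linearisation error.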
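The LQR step mentioned above can be illustrated on the simplest possible case. The sketch below is a generic scalar discrete-time LQR solved by iterating the Riccati recursion; it assumes a hypothetical linearised error model `e[k+1] = a*e[k] + b*u[k]` with stage cost `q*e^2 + r*u^2`, and it is not the quadrotor design from the paper.

```python
def scalar_dlqr(a, b, q, r, iterations=500):
    """Iterate the scalar discrete-time Riccati recursion to a fixed point.

    Returns the feedback gain K (for u = -K * e) and the cost-to-go weight P.
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    gain = a * b * p / (r + b * b * p)
    return gain, p


# Hypothetical example: an integrator-like error with unit weights.
gain, cost_to_go = scalar_dlqr(a=1.0, b=1.0, q=1.0, r=1.0)
closed_loop_pole = 1.0 - 1.0 * gain  # a - b*K, must lie inside the unit circle
```

For these unit values the fixed point satisfies `P**2 = P + 1`, so `P` is the golden ratio and the gain is about 0.618. For a matrix-valued system the same recursion applies with matrix products and an inverse in place of the scalar division.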
Visual Inertial Odometry (VIO) is the problem of estimating a robot's trajectory by combining information from an inertial measurement unit (IMU) and a camera, and is of great interest to the robotics community. This paper develops a novel Lie group symmetry for the VIO problem and applies the recently proposed equivariant filter. The symmetry is shown to be compatible with the invariance of the VIO reference frame, lead to exact linearisation of bias-free IMU dynamics, and provide equivariance of the visual measurement function. As a result, the equivariant filter (EqF) based on this Lie group is a consistent estimator for VIO with lower linearisation error in the propagation of state dynamics and a higher order equivariant output approximation than standard formulations. Experimental results on the popular EuRoC and UZH FPV datasets demonstrate that the proposed system outperforms other state-of-the-art VIO algorithms in terms of both speed and accuracy.
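Combining simultaneous localisation and mapping (SLAM) estimation with dynamic scene modelling can greatly benefit robot autonomy in dynamic environments. Robot path planning and obstacle avoidance tasks rely on accurate estimates of the motion of dynamic objects in the scene. This paper presents VDO-SLAM, a robust visual dynamic object-aware SLAM system that exploits semantic information to enable accurate motion estimation and tracking of dynamic rigid objects in the scene without any prior knowledge of the objects' shape or geometric models. The proposed approach identifies and tracks the dynamic objects and the static structure in the environment and integrates this information into a unified SLAM framework. This results in highly accurate estimates of the robot's trajectory and the full SE(3) motion of the objects, as well as a spatiotemporal map of the environment. The system is able to extract linear velocity estimates from the objects' SE(3) motion, providing an important capability for navigation in complex dynamic environments. We demonstrate the performance of the proposed system on a number of real indoor and outdoor datasets, and the results show consistent and substantial improvements over state-of-the-art algorithms. An open-source version of the source code is available.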
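One simple way to read a linear velocity off an SE(3) motion is to apply the motion to a reference point on the object and divide the displacement by the frame interval. The sketch below assumes the motion `(R, t)` is expressed in a fixed world frame and that the reference point (for example the object centroid) is known; the function names are hypothetical and this is not taken from the VDO-SLAM code.

```python
import math


def mat_vec(rotation, vector):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(rotation[i][j] * vector[j] for j in range(3)) for i in range(3)]


def linear_velocity(rotation, translation, point, dt):
    """Velocity of `point` under the rigid motion p -> R p + t over `dt` seconds."""
    moved = [rp + t for rp, t in zip(mat_vec(rotation, point), translation)]
    return [(m - p) / dt for m, p in zip(moved, point)]


# Hypothetical motion: 90 degree rotation about z plus a 1 m shift along x.
rotation_z_90 = [[0.0, -1.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]]
velocity = linear_velocity(rotation_z_90, [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], dt=0.1)
speed = math.sqrt(sum(v * v for v in velocity))
```

The displacement equals `t - (I - R) p`, so the estimated velocity depends on the chosen reference point whenever the object rotates.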
We present a dynamic path planning algorithm to navigate an amphibious rotor craft through a concave time-invariant obstacle field while attempting to minimize energy usage. We create a nonlinear quaternion state model that represents the rotor craft dynamics above and below the water. The 6-degree-of-freedom dynamics are used within a layered architecture to generate motion paths for the vehicle to follow, together with the required control inputs. The rotor craft has a 3-dimensional map of its surroundings that is updated via limited-range onboard sensor readings within the current medium (air or water). Path planning is done via a probabilistic roadmap (PRM) and D* Lite.
Nine language-vision AI models trained on web scrapes with the Contrastive Language-Image Pretraining (CLIP) objective are evaluated for evidence of a bias studied by psychologists: the sexual objectification of girls and women, which occurs when a person's human characteristics are disregarded and the person is treated as a body or a collection of body parts. A first experiment uses standardized images of women from the Sexual OBjectification and EMotion Database, and finds that, commensurate with prior research in psychology, human characteristics are disassociated from images of objectified women: the model's recognition of emotional state is mediated by whether the subject is fully or partially clothed. Embedding association tests (EATs) return significant effect sizes for both anger (d >.8) and sadness (d >.5). A second experiment measures the effect in a representative application: an automatic image captioner (Antarctic Captions) includes words denoting emotion less than 50% as often for images of partially clothed women than for images of fully clothed women. A third experiment finds that images of female professionals (scientists, doctors, executives) are likely to be associated with sexual descriptions relative to images of male professionals. A fourth experiment shows that a prompt of "a [age] year old girl" generates sexualized images (as determined by an NSFW classifier) up to 73% of the time for VQGAN-CLIP (age 17), and up to 40% of the time for Stable Diffusion (ages 14 and 18); the corresponding rate for boys never surpasses 9%. The evidence indicates that language-vision AI models trained on automatically collected web scrapes learn biases of sexual objectification, which propagate to downstream applications.
Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
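The idea behind sample average approximation can be shown on a toy problem: pick the single decision that minimizes the cost averaged over a set of forecast scenarios. The sketch below assumes a hypothetical energy purchase decision with made-up prices and demand scenarios; it is not the winning method's model, which the abstract does not detail.

```python
def scenario_cost(purchase, demand, unit_price=1.0, shortfall_price=2.0):
    """Cost of buying `purchase` units ahead, plus a penalty for any unmet demand."""
    shortfall = max(demand - purchase, 0.0)
    return unit_price * purchase + shortfall_price * shortfall


def saa_decision(candidates, scenarios):
    """Return the candidate with the lowest cost averaged over all scenarios."""
    def average_cost(purchase):
        return sum(scenario_cost(purchase, d) for d in scenarios) / len(scenarios)
    return min(candidates, key=average_cost)


# Hypothetical forecast scenarios for demand (arbitrary units).
demand_scenarios = [8.0, 10.0, 12.0]
best_purchase = saa_decision(range(0, 16), demand_scenarios)
```

Optimizing jointly over the scenarios hedges against forecast error, whereas optimizing for a single point forecast commits fully to one possible future. In a realistic scheduling problem the enumeration over candidates would be replaced by a mixed integer program with one copy of the scenario-dependent variables per scenario.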
We apply the vision transformer, a deep machine learning model built around the attention mechanism, to mel-spectrogram representations of raw audio recordings. When adding mel-based data augmentation techniques and sample-weighting, we achieve comparable performance on both (PRS and CCS challenge) tasks of ComParE21, outperforming most single model baselines. We further introduce overlapping vertical patching and evaluate the influence of parameter configurations. Index Terms: audio classification, attention, mel-spectrogram, unbalanced data-sets, computational paralinguistics
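The abstract does not define overlapping vertical patching. One plausible reading, sketched below purely as an assumption, is that each patch spans all mel bins and a fixed number of time frames, with a stride smaller than the patch width so that neighbouring patches share frames. The function name and the `width` and `stride` parameters are hypothetical.

```python
def vertical_patches(spectrogram, width, stride):
    """Cut a spectrogram (rows = mel bins, columns = time frames) into
    full-height patches of `width` frames, starting every `stride` frames."""
    n_frames = len(spectrogram[0])
    starts = range(0, n_frames - width + 1, stride)
    return [[row[s:s + width] for row in spectrogram] for s in starts]


# Toy spectrogram: 3 mel bins by 10 frames, values encode (bin, frame).
spectrogram = [[mel * 10 + frame for frame in range(10)] for mel in range(3)]
patches = vertical_patches(spectrogram, width=4, stride=2)
```

With `width=4` and `stride=2`, consecutive patches overlap by two frames; setting `stride` equal to `width` would give the non-overlapping case.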
Common to all kinds of recurrent neural networks (RNNs) is the intention to model relations between data points through time. When there is no immediate relationship between subsequent data points (e.g., when the data points are generated at random), we show that RNNs are still able to remember a few data points back into the sequence by memorizing them by heart using standard backpropagation. However, we also show that for classical RNNs, LSTM and GRU networks the distance of data points between recurrent calls that can be reproduced this way is highly limited (compared to even a loose connection between data points) and subject to various constraints imposed by the type and size of the RNN in question. This implies the existence of a hard limit (way below the information-theoretic one) for the distance between related data points within which RNNs are still able to recognize said relation.